Synset embedding
Knowledge base

The key idea is that embeddings which predict the triples of a knowledge base well are those that cluster similar synsets together. By training a model on the knowledge base completion task, researchers hope to obtain good synset embeddings as a by-product.

Given a triple, e.g. (cat, has_part, tail), let ''l'' denote the left synset, ''r'' the right synset, and ''t'' the relation between them. In the literature, there are two ways to formalize this task:

* ''Margin-based'': A scoring function ''g'' should score each observed triple higher than a random triple by at least some margin. The objective is to maximize L + R, where in L the left entity is replaced by a random entity l', L = \sum \min(g(l, t, r) - g(l', t, r) - margin, 0), while in R the right entity is replaced by a random entity r', R = \sum \min(g(l, t, r) - g(l, t, r') - margin, 0).
* ''Negative sampling'': A function ''f'' is trained to distinguish a triple drawn from the correct distribution ''D'' from one drawn from a uniform distribution ''N''. For example, if f(\cdot) stands for the probability that a triple comes from ''D'', then we minimize the negative log-likelihood: \sum_{D} -\log f(l, t, r) + \sum_{N} -\log (1 - f(l', t', r')).

Many models have been proposed (see Yang et al., 2014 for a review: Yang, B., Yih, W., He, X., Gao, J., & Deng, L. (2014). Embedding Entities and Relations for Learning and Inference in Knowledge Bases, 12. Computation and Language. Retrieved from http://arxiv.org/abs/1412.6575). Several of them define ''g'' as a distance or energy, where a smaller value indicates a more plausible triple; for such models the negated value plays the role of the score in the objectives above.

# ''Unstructured'' (A. Bordes, X. Glorot, J. Weston, and Y. Bengio. A semantic matching energy function for learning with multi-relational data. Machine Learning, 2013): Treats all relations identically, ignoring ''t'': g(l, t, r) = \|l - r\|_p
# ''RESCAL'' (M. Nickel, V. Tresp, and H.-P. Kriegel. A three-way model for collective learning on multi-relational data. In Proceedings of the 28th International Conference on Machine Learning (ICML), 2011): TODO
# ''SE'' (A. Bordes, J. Weston, R. Collobert, and Y. Bengio. Learning structured embeddings of knowledge bases. In Proceedings of the 25th Annual Conference on Artificial Intelligence (AAAI), 2011): A relation is represented by two matrices, each acting as a linear transformation on one side: g(l, t, r) = \|t_L l - t_R r\|_p
# ''SME(LINEAR)'': A relation is represented by two vectors, and the semantic matching energy function compares the two sides: g(l, t, r) = (w_{L1} l + w_{L2} t_L + b_L)^\top (w_{R1} r + w_{R2} t_R + b_R)
# ''SME(BILINEAR)'': The representation of relations stays the same, but the weights are rank-3 tensors: g(l, t, r) = ((w_L \times t_L) l)^\top ((w_R \times t_R) r)
# ''LFM'' (R. Jenatton, N. Le Roux, A. Bordes, G. Obozinski, et al. A latent factor model for highly multi-relational data. In Advances in Neural Information Processing Systems (NIPS 25), 2012): TODO
# ''TransE'' (Bordes, A., Usunier, N., Weston, J., & Yakhnenko, O. (2013). Translating Embeddings for Modeling Multi-relational Data. In N (pp. 1–9)): A relation is a translation from the left-hand side to the right-hand side: g(l, t, r) = \|l + t - r\|_p
# ''TransM'' (Miao Fan, Qiang Zhou, Emily Chang, Thomas Fang Zheng. Transition-based Knowledge Graph Embedding with Relational Mapping Properties. PACLIC'14): Scales the TransE score by a relation-specific weight: g(l, t, r) = w_T \|l + t - r\|_p
# ''Neural tensor network'' (R. Socher, D. Chen, C. D. Manning, and A. Y. Ng. Learning new facts from knowledge bases with neural tensor networks and semantic word vectors. In Advances in Neural Information Processing Systems (NIPS 26), 2013): Relations are represented as rank-3 tensors, and the score consists of two parts. The "tensor" part is the product of the two entities with the relation tensor, and the "neural" part adds a linear combination of the entities: g(l, t, r) = l^\top t r + w_L l + w_R r

Knowledge base + Text

Wang et al. (2014) (Wang, Z., Zhang, J., Feng, J., & Chen, Z. (2014). Knowledge Graph and Text Jointly Embedding. In The 2014 Conference on Empirical Methods on Natural Language Processing. ACL – Association for Computational Linguistics. Retrieved from http://research.microsoft.com/apps/pubs/default.aspx?id=228269) build two separate models, one for entities and one for words, and align them using Wikipedia anchors or entity names.

Bordes et al. (2012) (Bordes, A., & Weston, J. (2012). Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing, 22, 127–135) train embeddings on the knowledge base completion and word sense disambiguation tasks simultaneously, thereby making use of both knowledge bases and corpora.
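As a rough illustration of the margin-based objective and the negative-sampling loss from the Knowledge base section, the sketch below evaluates both on a toy example using the TransE distance. It is a minimal sketch under stated assumptions, not the implementation from any of the cited papers: the function names (`transe_distance`, `margin_term`, `margin_objective`, `negative_log_likelihood`) and the 2-dimensional toy vectors are invented for illustration. Because TransE yields a distance, the sketch assumes the negated distance is used as the score ''g'', so that higher means more plausible.

```python
import math
import random


def p_norm(vec, p=1):
    """L_p norm of a list of floats."""
    return sum(abs(x) ** p for x in vec) ** (1.0 / p)


def transe_distance(l, t, r, p=1):
    """TransE dissimilarity |l + t - r|_p (smaller = more plausible)."""
    return p_norm([li + ti - ri for li, ti, ri in zip(l, t, r)], p)


def score(l, t, r, p=1):
    """Assumed sign convention: negated distance, so higher = more plausible."""
    return -transe_distance(l, t, r, p)


def margin_term(pos_score, neg_score, margin=1.0):
    """One summand of L or R: min(g(observed) - g(corrupted) - margin, 0)."""
    return min(pos_score - neg_score - margin, 0.0)


def margin_objective(triples, entities, relations, margin=1.0, seed=0):
    """L + R, with one random left and one random right corruption per triple."""
    rng = random.Random(seed)
    names = sorted(entities)
    total = 0.0
    for l, t, r in triples:
        pos = score(entities[l], relations[t], entities[r])
        l_rand, r_rand = rng.choice(names), rng.choice(names)
        # Contribution to L: left entity replaced by a random one.
        total += margin_term(pos, score(entities[l_rand], relations[t], entities[r]), margin)
        # Contribution to R: right entity replaced by a random one.
        total += margin_term(pos, score(entities[l], relations[t], entities[r_rand]), margin)
    return total


def negative_log_likelihood(probs_observed, probs_noise):
    """Negative-sampling loss: sum_D -log f + sum_N -log(1 - f)."""
    return (-sum(math.log(f) for f in probs_observed)
            - sum(math.log(1.0 - f) for f in probs_noise))


# Toy 2-d embeddings, invented for illustration only.
ENTITIES = {"cat": [0.0, 0.0], "tail": [1.0, 0.0], "dog": [0.0, 3.0]}
RELATIONS = {"has_part": [1.0, 0.0]}
TRIPLES = [("cat", "has_part", "tail")]

if __name__ == "__main__":
    print(transe_distance(ENTITIES["cat"], RELATIONS["has_part"], ENTITIES["tail"]))
    print(margin_objective(TRIPLES, ENTITIES, RELATIONS, margin=1.0))
```

The objective is at most zero and reaches zero only when every observed triple beats its corruptions by the full margin. In this sketch a random corruption may coincide with the original entity, in which case that term contributes exactly the negative margin; a training loop would typically resample such cases.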